* Jon Temple and Nicolas Van de Sijpe, 29-May-2017
* Analysis of robust IV distances.


*************************************************************
* SET UP DO-FILE
*************************************************************
version 14
clear all
set more off, perm
set matsize 2000
local aidinstrument "C:\Users\ec1nv\Dropbox\My Documents\My Papers\Aid and absorption"
local aidinstrumentdata "`aidinstrument'\Data"
local ouranalysis "`aidinstrumentdata'\Results\Our analysis"


*************************************************************
* BRING IN DATA AND SET UP DATA
*************************************************************
use "`ouranalysis'\Outliers\distancesdata.dta"
d, fulln
order country ccode obsnum
/* Identify distances as coming from static model. */
foreach var of varlist cd-mxd {
	rename `var' `var'stat
}
d, fulln
sort obsnum

/* Merge in distances from dynamic model. */
merge 1:1 obsnum using "`ouranalysis'\Outliers\dynamicdistancesdata.dta"
assert _merge == 3 // confirm that the merge is perfect before dropping the indicator
drop _merge
d, fulln
/* Identify merged in distances as coming from dynamic model. */
foreach var of varlist cd-mxd {
	rename `var' `var'dyn
}
d, fulln


/* Generate maximum distances across the different models. */
gen maxdstat = max(cdstat,gdstat,tdstat,idstat,xdstat,mdstat,mxdstat)
order country ccode obsnum cdstat gdstat tdstat idstat xdstat mdstat mxdstat maxdstat
gsort -maxdstat
gen statoutlier = maxdstat > 20 if !missing(maxdstat) // Stata treats missing as larger than any number
order country ccode obsnum cdstat gdstat tdstat idstat xdstat mdstat mxdstat maxdstat statoutlier
list country ccode obsnum if statoutlier == 1
*browse
*browse country maxdstat
/* A cutoff of 20 flags observations from the following 6 countries:
Jordan; Madagascar; Chad; Congo, Dem. Rep.; Central African Rep.; Mauritania.
Just below the cutoff, the next 4 distances also belong to these countries.
The next distance after that is for Burundi (18.3), which the dynamic model
identifies more clearly as an outlier, giving 7 outlying countries in total.
Another 2 observations from the same 6 countries follow,
then Nicaragua with a maximum distance of 17.4.
The following 9 distances again belong to the 7 countries listed above,
which brings us to position 33: Guyana, with a maximum distance only just above 15.
Then come another 8 observations from the 7 countries, including 3 for Burundi.
Even a cutoff of 15 would add only 2 countries, Nicaragua and Guyana,
each because of a single observation.
Hence a cutoff of 20 seems the most natural choice. */
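
/* Illustrative check, not part of the original analysis: a minimal sketch that
reproduces the ranking discussed above in the Results window instead of the
data browser, and counts how many countries each candidate cutoff would flag.
The variable name rankstat, the 40-row limit and the pair of cutoffs (15, 20)
are assumptions made for illustration only. */
gsort -maxdstat
gen rankstat = _n // hypothetical helper: position in the descending ranking
list rankstat country maxdstat if rankstat <= 40, noobs
drop rankstat
foreach c of numlist 15 20 {
	tempvar first
	* Tag one observation per country among those above the cutoff.
	egen `first' = tag(country) if maxdstat > `c' & !missing(maxdstat)
	quietly count if `first' == 1
	display "Cutoff `c': " r(N) " countries flagged by the static model"
}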


/* For the dynamic model. */
gen maxddyn = max(cddyn,gddyn,tddyn,iddyn,xddyn,mddyn,mxddyn)
gsort -maxddyn
gen dynoutlier = maxddyn > 20 if !missing(maxddyn) // Stata treats missing as larger than any number
list country ccode obsnum if dynoutlier == 1
*browse country ccode obsnum cddyn gddyn tddyn iddyn xddyn mddyn mxddyn maxddyn dynoutlier
*browse country maxddyn
/* A cutoff of 20 flags observations from the following 7 countries:
Jordan; Madagascar; Congo, Dem. Rep.; Chad; Central African Rep.; Mauritania; Burundi.
Below the cutoff, the next 11 observations also belong to these countries,
followed by Nicaragua with a maximum distance of 17.1,
then another 4 observations from the list of 7,
and then Guinea-Bissau in 34th place with a maximum distance of 15.9. */
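
/* Illustrative check, not part of the original analysis: a minimal sketch that
shows how many flagged observations each country contributes under the dynamic
model, most frequent first. The condition repeats the cutoff of 20 used above
and explicitly excludes missing distances. */
tabulate country if maxddyn > 20 & !missing(maxddyn), sort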


/* Now for the maximum across all models -- static and dynamic. */
gen maxd = max(maxdstat,maxddyn)
gsort -maxd
gen outlier = maxd > 20 if !missing(maxd) // Stata treats missing as larger than any number
list country ccode obsnum if outlier == 1
*browse country ccode obsnum maxdstat statoutlier maxddyn dynoutlier maxd outlier
*browse country maxd
/* A cutoff of 20 flags 18 observations from the following 7 countries:
Jordan; Madagascar; Chad; Congo, Dem. Rep.; Central African Rep.; Mauritania; Burundi.
Below the cutoff, the next 11 observations also belong to these countries,
followed by Nicaragua with a maximum distance of 17.4,
then another 4 observations from the list of 7,
and then Guinea-Bissau in 35th place with a maximum distance of 15.9.
A cutoff of 20 therefore identifies the 7 countries that are the clearest outliers,
typically with multiple observations each. */
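
/* Illustrative check, not part of the original analysis: a minimal sketch that
compares the static and dynamic flags and then breaks the combined set of
flagged observations down by country. The choice of tabulations is an
assumption about what is useful to inspect, not the authors' procedure. */
tabulate statoutlier dynoutlier, missing
count if maxd > 20 & !missing(maxd)
tabulate country if maxd > 20 & !missing(maxd), sort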

